Is Image-based Object Pose Estimation Ready to Support Grasping?
Joyce, Eric C., Zhao, Qianwen, Burgdorfer, Nathaniel, Wang, Long, Mordohai, Philippos
We present a framework for evaluating 6-DoF instance-level object pose estimators, focusing on those that require a single RGB (not RGB-D) image as input. Besides gaining intuition about how accurate these estimators are, we are interested in the degree to which they can serve as the sole perception mechanism for robotic grasping. To assess this, we perform grasping trials in a physics-based simulator, using image-based pose estimates to guide a parallel gripper and an underactuated robotic hand in picking up 3D models of objects. Our experiments on a subset of the BOP (Benchmark for 6D Object Pose Estimation) dataset compare five open-source object pose estimators and provide insights that were missing from the literature.
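Instance-level pose estimates on BOP-style benchmarks are commonly scored with the ADD metric (average distance of transformed model points). A minimal sketch, with illustrative function and variable names; the paper's own grasping-based evaluation goes beyond this metric:

```python
import numpy as np

def add_metric(R_est, t_est, R_gt, t_gt, model_points):
    """Average Distance of Model Points (ADD): mean distance between
    model points transformed by the estimated and ground-truth poses."""
    pts_est = model_points @ R_est.T + t_est
    pts_gt = model_points @ R_gt.T + t_gt
    return np.linalg.norm(pts_est - pts_gt, axis=1).mean()
```

A pose is often counted as correct when ADD is below 10% of the object diameter.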
Cycle-Sync: Robust Global Camera Pose Estimation through Enhanced Cycle-Consistent Synchronization
Li, Shaohan, Shi, Yunpeng, Lerman, Gilad
We introduce Cycle-Sync, a robust and global framework for estimating camera poses (both rotations and locations). Our core innovation is a location solver that adapts message-passing least squares (MPLS) -- originally developed for group synchronization -- to camera location estimation. We modify MPLS to emphasize cycle-consistent information, redefine cycle consistencies using estimated distances from previous iterations, and incorporate a Welsch-type robust loss. We establish the strongest known deterministic exact-recovery guarantee for camera location estimation, showing that cycle consistency alone -- without access to inter-camera distances -- suffices to achieve the lowest sample complexity currently known. To further enhance robustness, we introduce a plug-and-play outlier rejection module inspired by robust subspace recovery, and we fully integrate cycle consistency into MPLS for rotation synchronization. Our global approach avoids the need for bundle adjustment. Experiments on synthetic and real datasets show that Cycle-Sync consistently outperforms leading pose estimators, including full structure-from-motion pipelines with bundle adjustment.
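The Welsch-type robust loss mentioned above suppresses outlier residuals by giving them exponentially decaying weights. A minimal IRLS sketch on a 1-D location problem; this illustrates only the reweighting behavior of the loss, not the paper's MPLS solver, and the scale `c` and iteration count are illustrative:

```python
import numpy as np

def welsch_irls_mean(x, c=1.0, iters=20):
    """Robust location estimate via IRLS with a Welsch loss:
    rho(r) = (c^2 / 2) * (1 - exp(-r^2 / c^2)), weight w(r) = exp(-r^2 / c^2)."""
    mu = np.median(x)              # robust initialization
    for _ in range(iters):
        r = x - mu
        w = np.exp(-(r / c) ** 2)  # outliers get near-zero weight
        mu = np.sum(w * x) / np.sum(w)
    return mu
```

With half the data corrupted by large offsets, the estimate stays near the inlier mode, whereas a plain least-squares mean would be pulled toward the outliers.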
CaLiV: LiDAR-to-Vehicle Calibration of Arbitrary Sensor Setups
Tahiraj, Ilir, Edinger, Markus, Kulmer, Dominik, Lienkamp, Markus
In autonomous systems, sensor calibration is essential for safe and efficient navigation in dynamic environments. Accurate calibration is a prerequisite for reliable perception and planning tasks such as object detection and obstacle avoidance. Many existing LiDAR calibration methods require overlapping fields of view, while others use external sensing devices or postulate a feature-rich environment. In addition, Sensor-to-Vehicle calibration is not supported by the vast majority of calibration algorithms. In this work, we propose a novel target-based technique for extrinsic Sensor-to-Sensor and Sensor-to-Vehicle calibration of multi-LiDAR systems called CaLiV. This algorithm works for non-overlapping fields of view and does not require any external sensing devices. First, we apply motion to produce field of view overlaps and utilize a simple Unscented Kalman Filter to obtain vehicle poses. Then, we use the Gaussian mixture model-based registration framework GMMCalib to align the point clouds in a common calibration frame. Finally, we reduce the task of recovering the sensor extrinsics to a minimization problem. We show that both translational and rotational Sensor-to-Sensor errors can be solved accurately by our method. In addition, all Sensor-to-Vehicle rotation angles can also be calibrated with high accuracy. We validate the simulation results in real-world experiments. The code is open-source and available on https://github.com/TUMFTM/CaLiV.
UniCalib: Targetless LiDAR-Camera Calibration via Probabilistic Flow on Unified Depth Representations
Han, Shu, Zhu, Xubo, Wu, Ji, Cai, Ximeng, Yang, Wen, Yu, Huai, Xia, Gui-Song
Precise LiDAR-camera calibration is crucial for integrating these two sensors into robotic systems to achieve robust perception. In applications like autonomous driving, online targetless calibration enables prompt correction of sensor misalignment caused by mechanical vibrations, without extra targets. However, existing methods exhibit limitations in extracting consistent features from LiDAR and camera data and fail to prioritize salient regions, compromising cross-modal alignment robustness. To address these issues, we propose DF-Calib, a LiDAR-camera calibration method that reformulates calibration as an intra-modality depth flow estimation problem. DF-Calib estimates a dense depth map from the camera image and completes the sparse LiDAR-projected depth map, using a shared feature encoder to extract consistent depth-to-depth features, effectively bridging the 2D-3D cross-modal gap. Additionally, we introduce a reliability map to prioritize valid pixels and propose a perceptually weighted sparse flow loss to enhance depth flow estimation. Experimental results across multiple datasets validate its accuracy and generalization, with DF-Calib achieving a mean translation error of 0.635 cm and rotation error of 0.045 degrees on the KITTI dataset.
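A reliability-weighted sparse flow loss of the kind described above can be sketched as a per-pixel weighted L1 error. The L1 choice and the exact weighting scheme here are assumptions for illustration; the paper's perceptual weighting may differ:

```python
import numpy as np

def weighted_flow_loss(flow_pred, flow_gt, reliability):
    """Reliability-weighted sparse flow loss (sketch): per-pixel L1 flow
    error, averaged with a per-pixel reliability map as weights."""
    err = np.abs(flow_pred - flow_gt).sum(-1)        # per-pixel L1 error
    return (reliability * err).sum() / reliability.sum()
```

Pixels with zero reliability (e.g. regions with no LiDAR returns) contribute nothing to the loss, so the supervision stays sparse.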
Semiotic Complexity and Its Epistemological Implications for Modeling Culture
Stine, Zachary K., Deitrick, James E.
The use of computational methods in the study of cultural artifacts--from models like linear regression and artificial neural networks, to how we evaluate and interpret those models--can be usefully understood as a kind of translation work from a complex, cultural medium into a formal, computational medium. Research questions arise in the cultural domain within culturally-embedded minds. When a researcher designs a computational model to aid in answering such a question, they translate from the cultural into the computational in each modeling decision they make. After completing this first translation problem, the researcher then makes use of the model by interpreting it (either directly or in downstream outputs that depend on it), requiring a second translation to be made, now from the computational back into the cultural, by way of culturally-embedded researchers making sense of it. In these bidirectional translation problems, we as researchers want to ensure that our translations are reasonable, that they can be sufficiently evaluated and understood by others engaged in collective knowledge-building. Yet translation work can vary in the complexity required to interpret and evaluate it. Consider, for example, how evaluating a translation of "hello" into modern Mandarin Chinese is much simpler than evaluating a translation of a text from classical (i.e., literary) Chinese, like the Zhuangzi.
This preprint article is currently under review.
PlaneHEC: Efficient Hand-Eye Calibration for Multi-view Robotic Arm via Any Point Cloud Plane Detection
Wang, Ye, Jing, Haodong, Liao, Yang, Ma, Yongqiang, Zheng, Nanning
Hand-eye calibration is an important task in vision-guided robotic systems and is crucial for determining the transformation matrix between the camera coordinate system and the robot end-effector. Existing methods for multi-view robotic systems usually rely on accurate geometric models or manual assistance, generalize poorly, and can be complicated and inefficient. Therefore, in this study, we propose PlaneHEC, a generalized hand-eye calibration method that does not require complex models and can be accomplished using only depth cameras, achieving fast and accurate calibration using arbitrary planar surfaces such as walls and tables. PlaneHEC introduces hand-eye calibration equations based on planar constraints, which makes it strongly interpretable and generalizable. PlaneHEC also uses a comprehensive solution that starts with a closed-form solution and refines it with iterative optimization, which greatly improves accuracy. We comprehensively evaluated the performance of PlaneHEC in both simulated and real-world environments and compared the results with other point-cloud-based calibration methods, demonstrating its superiority. Our approach achieves universal and fast calibration with an innovative design of computational models, contributing to the development of multi-agent systems and embodied intelligence.
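Planar constraints of this kind start from extracting a plane (unit normal and offset) from an observed point cloud. A standard least-squares plane fit via SVD, shown here as a generic building block rather than the paper's full calibration solver:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane fit: returns unit normal n and offset d such
    that n . x + d ~ 0 for points x on the plane."""
    c = points.mean(0)
    _, _, Vt = np.linalg.svd(points - c)
    n = Vt[-1]                     # direction of smallest variance
    return n, -n @ c
```

Each detected plane then contributes one constraint on the unknown hand-eye transform, and several non-parallel planes pin it down.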
TranslationCorrect: A Unified Framework for Machine Translation Post-Editing with Predictive Error Assistance
Wasti, Syed Mekael, Hung, Shou-Yi, Collins, Christopher, Lee, En-Shiun Annie
Machine translation (MT) post-editing and research data collection often rely on inefficient, disconnected workflows. We introduce TranslationCorrect, an integrated framework designed to streamline these tasks. TranslationCorrect combines MT generation using models like NLLB, automated error prediction using models like XCOMET or LLM APIs (providing detailed reasoning), and an intuitive post-editing interface within a single environment. The interface is built with human-computer interaction (HCI) principles in mind to minimize cognitive load, as confirmed by a user study. It enables translators to correct errors and batch-translate efficiently. For researchers, TranslationCorrect exports high-quality span-based annotations in the Error Span Annotation (ESA) format, using an error taxonomy inspired by Multidimensional Quality Metrics (MQM). These outputs are compatible with state-of-the-art error detection models and suitable for training MT or post-editing systems. Our user study confirms that TranslationCorrect significantly improves translation efficiency and user satisfaction over traditional annotation methods.
In-Hand Object Pose Estimation via Visual-Tactile Fusion
Nonnengießer, Felix, Kshirsagar, Alap, Belousov, Boris, Peters, Jan
Accurate in-hand pose estimation is crucial for robotic object manipulation, but visual occlusion remains a major challenge for vision-based approaches. This paper presents an approach to robotic in-hand object pose estimation, combining visual and tactile information to accurately determine the position and orientation of objects grasped by a robotic hand. We address the challenge of visual occlusion by fusing visual information from a wrist-mounted RGB-D camera with tactile information from vision-based tactile sensors mounted on the fingertips of a robotic gripper. Our approach employs a weighting and sensor fusion module to combine point clouds from heterogeneous sensor types and to control each modality's contribution to the pose estimation process. We use an augmented Iterative Closest Point (ICP) algorithm adapted for weighted point clouds to estimate the 6D object pose. Our experiments show that incorporating tactile information significantly improves pose estimation accuracy, particularly when occlusion is high. Our method achieves an average pose estimation error of 7.5 mm and 16.7 degrees, outperforming vision-only baselines by up to 20%. We also demonstrate the ability of our method to perform precise object manipulation in a real-world insertion task.
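An ICP variant for weighted point clouds changes the closed-form alignment step so that each correspondence contributes in proportion to its weight. A sketch of that single step under known correspondences; this is a generic weighted Kabsch solution, not necessarily the authors' exact formulation:

```python
import numpy as np

def weighted_kabsch(P, Q, w):
    """One alignment step of a weighted ICP: the closed-form rigid
    transform (R, t) minimizing sum_i w_i * ||R @ P_i + t - Q_i||^2."""
    w = w / w.sum()
    cp, cq = w @ P, w @ Q                        # weighted centroids
    H = (P - cp).T @ ((Q - cq) * w[:, None])     # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # guard against reflections
    return R, cq - R @ cp
```

In a visual-tactile setting, the weights would come from the fusion module, e.g. down-weighting camera points in occluded regions while trusting fingertip tactile points.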